@Mangaal Mangaal commented Sep 16, 2025

This PR introduces a unidirectional (agent → principal) log streaming service and wires it into the resource-proxy path so that the Principal can serve Kubernetes pod logs to the Argo CD UI. The Agent handles both static logs (follow=false) and live streaming (follow=true), with resume support for live streams.

What’s included:

  • New LogStreaming service (gRPC) — Agent opens a client-streaming RPC and pushes log chunks keyed by request_uuid; Principal writes directly to the HTTP response stream and returns a final status when the stream ends.
  • Principal resource-proxy integration — /…/pods/{name}/log requests are recognised, the HTTP writer is registered, and a log event is enqueued to the Agent.
  • Agent log workers — static and live log handlers; time-window flush or 64KiB chunk flush; live streaming has resume (SinceTime) on transient errors.
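The "time-window flush or 64KiB chunk flush" behaviour of the Agent log workers could look roughly like the sketch below. This is not the PR's code: `pumpLogs`, `fakeLogStream`, and the `send` callback are hypothetical names, and the callback merely stands in for whatever pushes a chunk onto the gRPC stream.

```go
package main

import (
	"io"
	"time"
)

// maxChunk mirrors the 64 KiB flush threshold named in the PR description.
const maxChunk = 64 * 1024

// fakeLogStream is a stand-in for a pod log stream: it yields `remaining`
// bytes of 'x' and then io.EOF. (Hypothetical helper for illustration.)
type fakeLogStream struct{ remaining int }

func (f *fakeLogStream) Read(p []byte) (int, error) {
	if f.remaining == 0 {
		return 0, io.EOF
	}
	n := len(p)
	if n > f.remaining {
		n = f.remaining
	}
	for i := 0; i < n; i++ {
		p[i] = 'x'
	}
	f.remaining -= n
	return n, nil
}

// pumpLogs reads r until EOF and hands data to send either when maxChunk
// bytes have accumulated or when the flush window elapses, whichever comes
// first. window must be positive (time.NewTicker panics otherwise).
func pumpLogs(r io.Reader, window time.Duration, send func([]byte) error) error {
	type readResult struct {
		data []byte
		err  error
	}
	reads := make(chan readResult)
	done := make(chan struct{})
	defer close(done)

	// Read in a separate goroutine so a blocked Read cannot delay the
	// time-window flush. If pumpLogs returns early, this goroutine exits
	// once its current Read call returns.
	go func() {
		buf := make([]byte, 4096)
		for {
			n, err := r.Read(buf)
			res := readResult{data: append([]byte(nil), buf[:n]...), err: err}
			select {
			case reads <- res:
			case <-done:
				return
			}
			if err != nil {
				return
			}
		}
	}()

	ticker := time.NewTicker(window)
	defer ticker.Stop()

	var pending []byte
	flush := func() error {
		if len(pending) == 0 {
			return nil
		}
		err := send(pending)
		pending = nil
		return err
	}

	for {
		select {
		case res := <-reads:
			pending = append(pending, res.data...)
			// Size-based flush: emit full 64 KiB chunks as soon as they exist.
			for len(pending) >= maxChunk {
				if err := send(pending[:maxChunk]); err != nil {
					return err
				}
				pending = pending[maxChunk:]
			}
			if res.err == io.EOF {
				return flush() // send whatever is left, then finish
			}
			if res.err != nil {
				_ = flush() // best effort before reporting the read error
				return res.err
			}
		case <-ticker.C:
			// Time-based flush: emit a partial chunk so the UI is not starved.
			if err := flush(); err != nil {
				return err
			}
		}
	}
}
```

With a long window, a 150 KiB input would be sent as two full 64 KiB chunks followed by a 22 KiB remainder; with a short window, slow streams are flushed in smaller pieces instead.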

Key features:

  • Principal LogStream gRPC server & HTTP bridge.
  • Agent log streaming implementation (static + live + resume).
  • Principal resource proxy: log-subresource branch & handoff to LogStream.
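One way to picture the Principal-side HTTP bridge is a registry that maps each request_uuid to the writer of the waiting HTTP response, so that chunks arriving over gRPC can be written straight through. The sketch below is an assumption about the shape of such a bridge, not the PR's implementation; `writerRegistry`, `Register`, `Deliver`, `Unregister`, and `captureWriter` are all hypothetical names.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"sync"
)

// writerRegistry maps a request UUID to the writer of the pending HTTP
// response. (Hypothetical type for illustration.)
type writerRegistry struct {
	mu      sync.Mutex
	writers map[string]io.Writer
}

func newWriterRegistry() *writerRegistry {
	return &writerRegistry{writers: make(map[string]io.Writer)}
}

// Register is called by the HTTP handler before the log event is enqueued
// to the Agent.
func (r *writerRegistry) Register(requestUUID string, w io.Writer) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.writers[requestUUID] = w
}

// Unregister is called when the stream ends or the HTTP client goes away.
func (r *writerRegistry) Unregister(requestUUID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.writers, requestUUID)
}

// Deliver writes one log chunk to the response registered for requestUUID
// and flushes it if the writer supports http.Flusher.
func (r *writerRegistry) Deliver(requestUUID string, chunk []byte) error {
	r.mu.Lock()
	w, ok := r.writers[requestUUID]
	r.mu.Unlock()
	if !ok {
		return fmt.Errorf("no writer registered for request %q", requestUUID)
	}
	if _, err := w.Write(chunk); err != nil {
		return err
	}
	if f, ok := w.(http.Flusher); ok {
		f.Flush() // push the chunk to the client immediately
	}
	return nil
}

// captureWriter is an in-memory writer used only to demonstrate the registry.
type captureWriter struct{ data []byte }

func (c *captureWriter) Write(p []byte) (int, error) {
	c.data = append(c.data, p...)
	return len(p), nil
}
```

In a real handler the registered writer would be the `http.ResponseWriter`; returning an error for an unknown UUID lets the gRPC side stop a stream whose HTTP client has already disconnected.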

Assisted-by: Cursor/Gemini etc

logs.mov

@Mangaal Mangaal force-pushed the log-streaming branch 2 times, most recently from 4d5e132 to 30aab14 Compare September 16, 2025 14:58
@Mangaal Mangaal closed this Sep 16, 2025
@Mangaal Mangaal reopened this Sep 16, 2025
Signed-off-by: Mangaal <[email protected]>
(cherry picked from commit 2a08301)
Signed-off-by: Mangaal <[email protected]>
(cherry picked from commit d07df62)
Signed-off-by: Mangaal <[email protected]>
(cherry picked from commit 161f2a4)
Signed-off-by: Mangaal <[email protected]>
(cherry picked from commit 30aab14)
Signed-off-by: Mangaal <[email protected]>
(cherry picked from commit f8a6666)
Signed-off-by: Mangaal <[email protected]>
(cherry picked from commit e820c35)
@codecov-commenter

codecov-commenter commented Sep 17, 2025

Codecov Report

❌ Patch coverage is 44.31818% with 294 lines in your changes missing coverage. Please review.
✅ Project coverage is 45.79%. Comparing base (f957238) to head (66bace8).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| agent/log.go | 33.86% | 154 Missing and 12 partials ⚠️ |
| principal/resource.go | 12.50% | 49 Missing and 7 partials ⚠️ |
| internal/event/event.go | 0.00% | 43 Missing ⚠️ |
| principal/apis/logstream/logstream.go | 83.73% | 23 Missing and 4 partials ⚠️ |
| agent/inbound.go | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #569      +/-   ##
==========================================
- Coverage   45.85%   45.79%   -0.06%     
==========================================
  Files          90       92       +2     
  Lines        9690    10200     +510     
==========================================
+ Hits         4443     4671     +228     
- Misses       4801     5065     +264     
- Partials      446      464      +18     


@Mangaal Mangaal marked this pull request as ready for review September 17, 2025 12:01
@chetan-rns
Collaborator

@Mangaal I see this error intermittently on the UI. Works fine after requesting the logs again. I guess we are not handling EOF somewhere?

Get "https://rathole-container-internal:9090/api/v1/namespaces/guestbook/pods/kustomize-guestbook-ui-7689b675bc-cbv8h/log?container=guestbook-ui&follow=true&tailLines=1000&timestamps=true": EOF

@Mangaal
Author

Mangaal commented Sep 30, 2025

@chetan-rns, thanks for reviewing my PR. I've updated it to address your suggestions. Please take a look when you get a chance.

Collaborator

@chetan-rns chetan-rns left a comment

@Mangaal Sorry for the delay. I've added a few questions around simplifying the agent logic. IMO, the agent should only propagate the options from the argocd server to client-go's GetLogs(), read the bytes from the reader until EOF, forward them back in chunks, and return any errors. That way we can avoid extracting timestamps to modify sinceTime dynamically. I think we can rely on the argocd server to handle the chunks, so the agent doesn't have to do any extra work. WDYT @jannfis?
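For illustration, the simplified agent loop suggested in this review (read until EOF, forward in chunks, return any errors) might be sketched as follows. `forwardLogs`, `sliceReader`, and the `send` callback are hypothetical names; in the agent, the reader would presumably be the stream obtained from client-go's `GetLogs(...).Stream(ctx)`, and `send` would push onto the gRPC stream.

```go
package main

import (
	"errors"
	"io"
)

// forwardLogs copies r to send in chunks of at most chunkSize bytes.
// It returns nil on a clean EOF and the first read or send error otherwise.
// send must not retain the slice it is given, because the buffer is reused.
func forwardLogs(r io.Reader, chunkSize int, send func([]byte) error) error {
	buf := make([]byte, chunkSize)
	for {
		n, err := r.Read(buf)
		if n > 0 {
			// Forward whatever was read, even if Read also returned an error.
			if sendErr := send(buf[:n]); sendErr != nil {
				return sendErr
			}
		}
		if errors.Is(err, io.EOF) {
			return nil // the log stream ended normally
		}
		if err != nil {
			return err
		}
	}
}

// sliceReader is an in-memory stand-in for a pod log stream.
// (Hypothetical helper for illustration.)
type sliceReader struct{ data []byte }

func (s *sliceReader) Read(p []byte) (int, error) {
	if len(s.data) == 0 {
		return 0, io.EOF
	}
	n := copy(p, s.data)
	s.data = s.data[n:]
	return n, nil
}
```

This version keeps no timestamp state and does no resume logic, which is the trade-off raised above: the agent stays simple, and any retry or chunk handling is left to the caller.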

@Mangaal Mangaal requested a review from mikeshng as a code owner October 16, 2025 09:34
Collaborator

@chetan-rns chetan-rns left a comment

Thanks @Mangaal! The overall PR looks good to me.

@jannfis @jgwest Please take a look when you have a moment.

4 participants